ABSTRACT


Automated monitoring of engine failure due to spark plug misfire can be
performed using machine learning. Unfortunately, a high efficiency rate is
difficult to achieve because a massive volume of training data is required.
In this study, the fault-generated signals were enhanced with a novel
statistical signal-based analysis called Z-freq to improve the exploration.
This study explores the time and frequency content obtained from the engine
while it operates under a specific condition. Throughout the trial, the
misfire was induced by cutting the supplied voltage to simulate the actual
effect of a worn-out spark plug. The fault signals produced by the spark plug
misfire were collected using a highly sensitive, compact, and robust
piezo-based sensor, the accelerometer. The results and analysis indicated a
significant pattern in the coefficient value and scattering of Z-freq data
for spark plug misfire. Lastly, the simulation and experimental outputs were
verified in a series of performance metrics tests using accuracy,
sensitivity, and specificity for prediction purposes. It was confirmed that
the proposed technique is capable of diagnosis: fault detection, fault
localization, and fault severity classification.

1. INTRODUCTION

Vibration monitoring of faults is well established and widely used in the engineering sector,
specifically in research on structural health monitoring, mainly of rotating mechanisms [1]-[5]. One of
the most significant topics in the automotive field is the reliability of the vehicle engine, which is
a crucial issue for manufacturers seeking to guarantee customer satisfaction. Smart detection, localization,
and classification of faults are essential steps toward automated monitoring and predictive
systems [6]-[11]. The internal combustion engine (ICE) is defined as a heat engine powered by a
chemical process, the combustion of a fuel and air mixture. Most ICEs are four-stroke engines: intake,
compression, power, and exhaust. The ICE is regarded as one of the components most likely to fail [12]-[14].

An overworked engine without planned maintenance can suffer engine failure, which is likely
to end in severe loss [15]-[17]. When an engine component does not perform smoothly, diagnostic
activities should be carried out to discover the possible cause. Many faults can be spotted by inspecting the parts,
but the early symptoms of faults should be predicted to prevent the failure from recurring
[12], [18], [19]. Lei et al. [20] completed a wide-ranging review of the application of
machine learning methods, covering past, current, and upcoming trends. In general, it shows that
statistical signal analysis has been, and will continue to be, applied in fault diagnosis together with machine learning
and deep learning [21], [22]. Gardner et al. [23] worked on structural health monitoring (SHM)
using machine learning with three domain adaptation techniques, namely transfer component analysis (TCA),
joint domain adaptation (JDA), and adaptation regularization based transfer learning (ARTL). These
three techniques are known to reduce the discrepancy between data distributions and improve classification accuracy.

Techniques to identify engine faults were established in [24], which showed that
time and frequency content analysis and the I-kaz method, applied with an ultrasonic sensor and a strain gauge, can
trace faults in the cylinder block during driving. Rizvi et al. [25] found that receiver operating
characteristic (ROC) analysis and a Markov process can detect misfire in the sparking area. Faults can
happen in any component of the engine [26]. Many investigators have studied engines with faults such as spark
plug misfire, valve head crack, cylinder wall damage, and spring failure [18], [27]-[29]. Meanwhile,
several recent researchers have examined faults such as abnormal fuel injector, misfiring, shaft
imbalance, clogged intake filter, leaking spark plug, abnormal valve clearance, and piston ring fault.
Spark plug misfire can cause vibration in the engine compartment [30], either regularly or
intermittently. Babu et al. [31] indicated that vibrations occur in the cylinder head
when the spark plug misfires.

Vibration analysis is an effective method for recognizing faults in engine cylinders through both time
and frequency content analysis [27], [32]. According to [33], which investigated the effect of an abrasive
piston fault on engine vibration, piston scratches have a substantial impact on engine
vibration and increase it. During misfire, insufficient power is produced, and engine exhaust emissions of
nitrogen oxides (NOx), carbon monoxide (CO), and hydrocarbons (HC) increase [31], [34]-[36]. Machine
learning has been used to trace misfire. Among the methods, the MultiClass Classifier was found to
perform better than other algorithms such as AdaBoost, LogitBoost, and J48 [31]. Jafarian et al. [9]
performed fault analysis on a rotary mechanism by examining the vibration signal produced by the engine.
That study used the fast Fourier transform (FFT) and a feature extraction approach to
extract 16 features before classification with machine learning algorithms such as
artificial neural networks (ANN), support vector machines (SVM), and k-nearest neighbor (KNN) [37]. The
performance of ANN, SVM, and KNN was found to be substantial in identifying engine faults. Finally,
no information has been found so far on using machine learning to detect both misfire and other
faults in an engine [38].


2. RESEARCH METHOD 
2.1. Statistical signal analysis method 

Most signals from acquisition activities contain non-deterministic features, which pose a
challenge for researchers using signal processing techniques [31], [39]. The Z-freq statistical signal
analysis technique was proposed to extract information from such random signals. The collected signal
is converted into the frequency domain using the fast Fourier transform (FFT) and is then
examined with the derived statistical method to calculate the Z-freq coefficient value. Higher
kurtosis values indicate the presence of more extreme values than expected in a Gaussian distribution. Kurtosis is used in
industry to trace signs of damage because of its sensitivity to high-amplitude signals [40], [41]. The Z-freq technique is
a graphical depiction of the frequency distribution of the collected signal based on kurtosis. The time-domain
signal is decomposed into two frequency bands, in which the x-axis denotes low frequency (affix) and the y-axis
represents high frequency (annex). The affix band covers frequencies from 0 to 0.5fmax, and the annex band
covers frequencies from 0.5fmax to 1.0fmax. To quantify the scattering of the data, the Z-freq
coefficient measures the distance of every data point from the data centroid. The Z-freq coefficient is defined by:


$$Z^{f} = \frac{1}{n}\sqrt{K_{afx} S_{afx}^{4} + K_{anx} S_{anx}^{4}} \qquad (1)$$

where $K_{afx}$ and $S_{afx}$ are the kurtosis and standard deviation for the low-frequency range, respectively, while
$K_{anx}$ and $S_{anx}$ are the kurtosis and standard deviation for the high-frequency range, respectively. The $Z^{f}$ coefficient
is formulated based on the normal order of Daubechies signal decomposition.
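
As a rough illustration of how such a coefficient could be computed, the sketch below splits a signal into the affix and annex bands and combines their kurtosis and standard deviation. Several points are assumptions rather than the authors' method: the coefficient is taken to have the I-kaz-style form (1/n)·sqrt(K_afx·S_afx^4 + K_anx·S_anx^4) with n as the number of samples, fmax is taken to be the Nyquist frequency, and an ideal FFT mask stands in for the Daubechies decomposition. The function names `kurtosis`, `split_bands`, and `z_freq` are hypothetical.

```python
import numpy as np


def kurtosis(x):
    """Non-excess kurtosis: fourth central moment over squared variance."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    variance = np.mean(centered ** 2)
    return np.mean(centered ** 4) / variance ** 2


def split_bands(signal):
    """Split a time signal into affix (low) and annex (high) band signals.

    Assumption: fmax is the Nyquist frequency, and an ideal FFT mask is
    used in place of the Daubechies decomposition named in the text.
    """
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(signal)
    half = len(spectrum) // 2          # bin closest to 0.5 * fmax
    low = spectrum.copy()
    low[half:] = 0.0                   # keep 0 .. 0.5 fmax
    high = spectrum.copy()
    high[:half] = 0.0                  # keep 0.5 fmax .. fmax
    affix = np.fft.irfft(low, n=len(signal))
    annex = np.fft.irfft(high, n=len(signal))
    return affix, annex


def z_freq(signal):
    """Assumed form: (1/n) * sqrt(K_afx * S_afx**4 + K_anx * S_anx**4)."""
    affix, annex = split_bands(signal)
    n = len(signal)
    term_afx = kurtosis(affix) * np.std(affix) ** 4
    term_anx = kurtosis(annex) * np.std(annex) ** 4
    return np.sqrt(term_afx + term_anx) / n


# Synthetic two-tone test signal (not engine data), one second at 25.6 kHz
fs = 25_600
t = np.arange(fs) / fs
demo = np.sin(2 * np.pi * 100 * t) + 0.5 * np.sin(2 * np.pi * 9_000 * t)
print(z_freq(demo))
```

Under this assumed form, doubling the signal amplitude multiplies the coefficient by four, because kurtosis is scale-invariant while the standard deviation enters to the fourth power under the square root.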


2.2. Support vector machine (SVM) 

The support vector machine (SVM) is commonly used as a prediction tool in machine learning to
forecast stock markets, weather, and other failure events. This supervised learning model is
also used for classification. It is a well-known AI algorithm in pattern recognition for estimating
and learning linear predictors in high-dimensional feature spaces [42]-[45]. This capability makes it easier to
classify and observe the engine vibration data as well as the Z-freq coefficients. The training data for
normal and misfire conditions were randomly chosen from the collected data for all cases to monitor the
classification. The machine learning workflow for the Z-freq coefficient is shown in Figure 1.


[Figure 1 block diagram. Recoverable labels: Fault Signal, Reference Data, Z-freq, Dataset Fusion, Random select, Complete select, Result, Final Diagnosis]

Figure 1. Effects of selecting different switching under dynamic condition


2.3. k-nearest neighbor (kNN) 

In a machine learning system, k-nearest neighbor (kNN) is a classification technique that stores
selected features with their class labels and classifies a new sample according to the labels of its closest
neighbors. Finding the closest neighbors can be extremely fast under some conditions, and the method rests
on the assumption that the features are relevant to the labelling [9], [43], [44], [46], [47]. Three
performance metrics are used to measure the performance of the classifiers: accuracy, sensitivity, and
specificity [48], [49]. In this binary classification, each prediction falls into one of four cases: true
positive (TP), true negative (TN), false positive (FP), and false negative (FN). In the confusion matrix,
"true" and "false" indicate whether a prediction is correct or incorrect, while "positive" and
"negative" indicate whether a fault is detected or not. In evaluating the classifiers'
performance, accuracy measures the ratio of correctly classified data points, sensitivity measures the
proportion of actual faults that are detected, and specificity measures the proportion of healthy data
correctly identified as healthy. These are given by:


Accuracy = (TP + TN)/(TP + TN + FP + FN) (2) 
Sensitivity = TP/(TP + FN) (3) 
Specificity = TN/(TN + FP) (4) 
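
Equations (2)-(4) can be checked with a small helper. This is a minimal sketch; the function name `classification_metrics` and the counts in the example are hypothetical and are not taken from the paper's data.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return accuracy, sensitivity, and specificity as in equations (2)-(4)."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,      # equation (2)
        "sensitivity": tp / (tp + fn),      # equation (3)
        "specificity": tn / (tn + fp),      # equation (4)
    }


# Hypothetical confusion-matrix counts, for illustration only
metrics = classification_metrics(tp=45, tn=40, fp=10, fn=5)
print(metrics)
```

With these illustrative counts, 45 of 50 faults are detected and 40 of 50 healthy samples are correctly cleared, so the accuracy lies between the sensitivity and the specificity.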


2.4. Experimental setup and signal analysis 

Four accelerometers were used to acquire the vibration signal of the engine. The engine has four
cylinders, a 2.0 L cubic capacity, and a 16-valve double overhead cam (DOHC) configuration. Two engine
conditions were tested: normal and misfire. The experimental setup is shown in Figure 2, where Figure 2(a) shows the test
engine and Figure 2(b) shows the sensor installation. Each condition was run at five different speeds (750, 1000,
1500, 2000, and 2500 rpm) to capture the data pattern as the speed increased. The diagram of the
experimental setup is shown in Figure 3(a). Heat from engine operation, caused by friction between the
engine parts, can raise the temperature to 90 °C while the engine is running, which can lead to inconsistent signal
recording. An additional rigid plate was used to isolate the heat, and filtering was applied to
reduce unwanted noise. The engine misfire analysis process is illustrated in
Figure 3(b). It starts with signal measurement at the engine wall by an accelerometer for each
cylinder.

The outcomes of this statistical analysis with Z-freq are expressed as values categorized
by engine condition. The normal and fault conditions are compared based on the graph pattern, and
verification is carried out using machine learning. In this study, the support vector machine
(SVM) and k-nearest neighbor (kNN) classifiers described in sections 2.2 and 2.3 were used to verify the reliability of the Z-freq analysis. To
measure the performance of the machine learning techniques, the accuracy, sensitivity, and specificity
are calculated as in (2)-(4). Finally, the calculated values are presented in a 2-D scatter plot to
visualize the pattern.
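
The kNN verification step can be sketched with a minimal classifier written directly in NumPy. This is an illustration under stated assumptions, not the classifier configuration used in the study: the function name `knn_predict`, the choice k = 3, the Euclidean distance, and the two-dimensional feature values are all hypothetical.

```python
import numpy as np


def knn_predict(train_x, train_y, query_x, k=3):
    """Label each query point by majority vote among its k nearest training points."""
    train_x = np.asarray(train_x, dtype=float)
    train_y = np.asarray(train_y)
    query_x = np.asarray(query_x, dtype=float)
    # Pairwise Euclidean distances, shape (n_query, n_train)
    dists = np.linalg.norm(query_x[:, None, :] - train_x[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]   # indices of the k closest points
    votes = train_y[nearest]                     # their labels
    return np.array([np.bincount(row).argmax() for row in votes])


# Hypothetical 2-D features; label 0 = normal, label 1 = misfire
train_x = [[2.0, 2.1], [2.2, 1.9], [1.9, 2.0],
           [1.0, 1.1], [1.1, 0.9], [0.9, 1.0]]
train_y = [0, 0, 0, 1, 1, 1]
predictions = knn_predict(train_x, train_y, [[2.1, 2.0], [1.0, 1.0]], k=3)
print(predictions)
```

An odd k avoids tied votes in a two-class problem such as normal versus misfire.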





Figure 2. Experimental setup: (a) engine test setup and (b) sensor installation

Figure 3. Experimental setup and flowchart: (a) diagram and (b) misfire faults diagnostic process flow using
Z-freq method


3. RESULTS AND DISCUSSION 

The experiment was logged with a data acquisition system at a 25.6 kHz sampling frequency. This
high sampling rate ensures that every event in the combustion process, from the intake,
compression, power, and exhaust strokes, can be observed during the misfire fault. Figure 4 shows the
experimental results in the time domain and frequency domain for six engine speeds (750 rpm, 1000
rpm, 1500 rpm, 2000 rpm, 2500 rpm and 3000 rpm) under normal conditions. The results show that the
amplitude in both the time domain and the frequency domain increases as the engine speed rises. The
frequency-domain graph is filtered with a Butterworth filter, while the full raw signal will be used for the
subsequent monitoring analysis. Furthermore, the graph contains many additional frequencies that
represent the vibration of other engine parts, such as the belt system and the engine mounting. Engine
speed under normal conditions therefore essentially affects the vibration characteristics.
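
To give a sense of how finely the 25.6 kHz sampling rate resolves the combustion process, the short calculation below counts the samples recorded during one four-stroke cycle, which spans two crankshaft revolutions. It is a back-of-the-envelope sketch; the function name `samples_per_cycle` is hypothetical.

```python
SAMPLING_RATE_HZ = 25_600  # sampling frequency stated in the text


def samples_per_cycle(rpm, sampling_rate_hz=SAMPLING_RATE_HZ):
    """Samples recorded during one four-stroke cycle (two crankshaft revolutions)."""
    # Two revolutions last 2 * 60 / rpm seconds
    return 120 * sampling_rate_hz / rpm


for rpm in (750, 1000, 1500, 2000, 2500, 3000):
    print(rpm, samples_per_cycle(rpm))

# Highest frequency representable at this sampling rate (Nyquist limit)
nyquist_hz = SAMPLING_RATE_HZ / 2
```

Even at 3000 rpm each cycle is covered by about a thousand samples, roughly 256 per stroke, and the Nyquist limit is 12.8 kHz.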

Figure 4. Experimental result for the time domain and frequency domain: (a) 1000 rpm and (b) 3000 rpm


The findings below are preliminary results of the new Z-freq technique. This analysis
emphasizes the frequency content of each acquired signal. The Z-freq value was
generated using the formulated equation, and the distribution of values is shown in the graph. Figure 5
presents the scattering of Z-freq data for each engine speed: (a) 1000 rpm and (b) 3000 rpm. The
graph indicates that the data distribution becomes broader as the engine speed
rises. Each colour represents the pair of low and high frequency for each signal,
separated using the Daubechies method. One interesting finding is that the scattering of the red region becomes
wider as the engine speed increases from 750 rpm to 3000 rpm. This result also indicates that the
Z-freq value increases considerably with speed.

Figure 6(a) shows the overall pattern of the Z-freq coefficient under normal conditions for
six speeds of the BMW engine (750 rpm, 1000 rpm, 1500 rpm, 2000 rpm, 2500 rpm, and 3000 rpm).
The graph shows the Z-freq coefficient increasing as the engine speed rises. However, the
coefficient rises drastically at 3000 rpm, indicating that certain components of the engine start to become
unstable. Figure 6(b) depicts the Z-freq coefficient for all observed channels
at all speeds. The pattern of Z-freq values is consistent across cylinders at every engine
speed, which indicates that the engine condition is normal for all cylinders. However, the value range differs
from speed to speed, reflecting the increase in vibration amplitude with speed. The value
increases to several times its previous level, which also indicates instability or unbalance in the
system.

Figures 7(a) and 7(b) show that when cylinder C2 or C3 runs with a
misfire fault, the Z-freq coefficient drops to about half of its normal value. The vibration can
therefore easily be identified as abnormal. Table 1 lists the Z-freq values for the normal and
spark plug misfire conditions. The results show how the value varies under the misfire condition:
it drops to approximately half when a misfire occurs in a cylinder,
compared with the neighbouring normal cylinder.

Figure 5. Scattering of data of Z-freq value for every engine speed: (a) 1000 rpm and (b) 3000 rpm

Figure 6. Pattern of the Z-freq coefficient for normal conditions: (a) for all engine speed and (b) for all engine
speed and cylinders

Figure 7. Z-freq coefficient with the piston at cylinder C2 or C3 running with misfire: (a) spark plug misfire at C2 and
(b) spark plug misfire at C3


Table 1. Summary of Z-freq coefficient for spark plug misfire condition

Engine speed    Engine cylinder (misfire)
(rpm)           C1        C2        C3        C4
1000            11.85     21.87     23.84     19.41
1500            24.20     15.20     27.20     26.20
2000            73.31     70.43     47.30     78.80
2500            126.14    136.70    129.50    69.50
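
The "about half" drop can be checked against the Table 1 values. The sketch below flags the cylinder with the lowest coefficient at each speed and compares it with the median of the other three. It assumes that the lowest value in each row marks the misfiring cylinder, which the table does not state explicitly; the function name `locate_lowest` is hypothetical.

```python
from statistics import median

# Z-freq coefficients from Table 1, keyed by engine speed (rpm); columns are C1..C4
TABLE_1 = {
    1000: [11.85, 21.87, 23.84, 19.41],
    1500: [24.20, 15.20, 27.20, 26.20],
    2000: [73.31, 70.43, 47.30, 78.80],
    2500: [126.14, 136.70, 129.50, 69.50],
}


def locate_lowest(coefficients):
    """Return the cylinder with the lowest value and its ratio to the median of the rest."""
    idx = min(range(len(coefficients)), key=coefficients.__getitem__)
    others = coefficients[:idx] + coefficients[idx + 1:]
    return f"C{idx + 1}", coefficients[idx] / median(others)


results = {rpm: locate_lowest(row) for rpm, row in TABLE_1.items()}
for rpm, (cylinder, ratio) in results.items():
    print(rpm, cylinder, round(ratio, 2))
```

Under that assumption, the lowest value falls in a different cylinder at each speed, and its ratio to the median of the remaining cylinders lies between roughly 0.5 and 0.65.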

Tables 2 and 3 are the SVM and KNN confusion matrices of the monitoring test, respectively, with different
inputs and two target outputs. This fault information is important to technical experts and engineers for
improving diagnostic activities. The correct classification ratio for the KNN technique is better than
that for SVM, as tabulated in both tables.

Table 4 lists the performance metrics for all conditions and shows that
both classification techniques perform substantially for both normal and fault conditions. Overall, the average accuracy,
sensitivity, and specificity of KNN are better than those of SVM, making it the more suitable classifier for the analysis
in this investigation. KNN gives equal or better results than SVM in all performance metrics, with
a reported 90.6 % accuracy, 100.0 % sensitivity, and 81.3 % specificity for the normal condition and 90.0 %
accuracy, 97.9 % sensitivity, and 81.8 % specificity for the misfire condition. The results also indicate the capability of the
Z-freq analysis technique to predict misfire and, probably, other faults.


Table 2. Confusion matrix of test for SVM

Condition of gasoline engine    Target/Output (using SVM)
                                Normal      Misfire
Normal                          100.00 %    0.00 %
Misfire                         14.60 %     85.40 %


Table 3. Confusion matrix of test for KNN

Condition of gasoline engine    Target/Output (using KNN)
                                Normal      Misfire
Normal                          100.00 %    0.00 %
Misfire                         2.10 %      97.90 %


Table 4. Performance metrics for both machine learning techniques

Condition   SVM                                             KNN
            Accuracy (%)  Sensitivity (%)  Specificity (%)  Accuracy (%)  Sensitivity (%)  Specificity (%)
Normal      80.6          100.0            61.1             90.6          100.0            81.3
Misfire     75.7          85.4             66.0             90.0          97.9             81.8


The present system was successfully developed to diagnose an internal combustion
engine fault, namely engine misfire. The diagnostic performance was demonstrated
by applying the new Z-freq technique, which focuses on the frequency
domain of the raw vibration signal acquired at the combustion chamber using an accelerometer. The most
significant result was that the Z-freq method was able to distinguish normal operation at a set of engine speeds
from misfires in various cylinders. Through the graphical representation and coefficient value, engineers or
specialists can easily determine the condition of the engine and diagnose faults from their effects. The Z-freq
results were validated by machine learning methods with high performance metrics. This study set out the
role of piezo-based sensors in applying a novel statistical signal analysis to internal combustion
engine fault monitoring. The evidence shows that the study was accomplished in line with
the hypothesis through the experimental program, analysis, and verification methods, and the approach can be applied in
future prediction programs.